Extended Topic Model for Word Dependency

نویسندگان

  • Tong Wang
  • Vish Viswanath
  • Ping Chen
چکیده

Topic Model such as Latent Dirichlet Allocation(LDA) makes assumption that topic assignment of different words are conditionally independent. In this paper, we propose a new model Extended Global Topic Random Field (EGTRF) to model non-linear dependencies between words. Specifically, we parse sentences into dependency trees and represent them as a graph, and assume the topic assignment of a word is influenced by its adjacent words and distance-2 words. Word similarity information learned from large corpus is incorporated to enhance word topic assignment. Parameters are estimated efficiently by variational inference and experimental results on two datasets show EGTRF achieves lower perplexity and higher log predictive probability.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

یک مدل موضوعی احتمالاتی مبتنی بر روابط محلّی واژگان در پنجره‌های هم‌پوشان

A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process, given the documents and extract topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...

متن کامل

A Maximum Entropy Language Model with Topic Sensitive Features

We present a maximum entropy approach to topic sensitive language modeling. By classifying the training data into di erent parts according to topic, extracting topic sensitive unigram features, and combining these new features with conventional N-grams in language modeling, we build a topic sensitive bigram model. This model improves both perplexity and word error rate. keywords: maximum entrop...

متن کامل

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

متن کامل

The Open University ’ s repository of research publications and other research outputs Dynamic joint sentiment - topic model

Copyright and Moral Rights for the articles on this site are retained by the individual authors and/or other copyright owners. For more information on Open Research Online's data policy on reuse of materials please consult the policies page. Social media data are produced continuously by a large and uncontrolled number of users. The dynamic nature of such data requires the sentiment and topic a...

متن کامل

Structured Topic Models for Language

This thesis introduces new methods for statistically modelling text using topic models. Topic models have seen many successes in recent years, and are used in a variety of applications, including analysis of news articles, topic-based search interfaces and navigation tools for digital libraries. Despite these recent successes, the field of topic modelling is still relatively new and there remai...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015